DETAILED ACTION
Double Patenting 

1. The nonstatutory double patenting rejection is based on a judicially created
doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the
unjustified or improper timewise extension of the "right to exclude" granted by a patent
and to prevent possible harassment by multiple assignees. See In re Goodman, 11
F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225
USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA
1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington,
418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) may be
used to overcome an actual or provisional rejection based on a nonstatutory double 
patenting ground provided the conflicting application or patent is shown to be commonly 
owned with this application. See 37 CFR 1.130(b). 

Effective January 1, 1994, a registered attorney or agent of record may sign a
terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 
37 CFR 3.73(b). 

2. Claims 1-54 are rejected under the judicially created doctrine of
obviousness-type double patenting as being unpatentable over claims 1-24 of U.S. Patent No.
6,684,205 and claims 1-8 of U.S. Patent No. 6,862,586. Although the conflicting claims
are not identical, they are not patentably distinct from each other because the claims of U.S.
Patent Nos. 6,684,205 and 6,862,586 contain every element of claims 1-54 of the instant
application.

"A later patent claim is not patentably distinct from an earlier patent claim if the 
later claim is obvious over, or anticipated by, the earlier claim. In re Longi, 759 F.2d at
896, 225 USPQ at 651."

Claim Rejections - 35 USC § 102

3. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(a) the invention was known or used by others in this country, or patented or described in a printed 
publication in this or a foreign country, before the invention thereof by the applicant for a patent. 

4. Claims 1-6, 15-19, 28, 38-40 and 42-46 are rejected under 35 U.S.C. 102(a) as 
being anticipated by Kuo et al., Web Document Classification based on Hyperlinks and 
Document Semantics, August 2000, pages 44-51 and hereinafter referred to as Kuo. 

As to claims 1, 3, 15, 28 and 42, Kuo teaches a method of searching a database
containing hypertext documents, said method comprising: searching said database 
using a query to produce a set of hypertext documents, and clustering said set of 
hypertext documents into various clusters such that documents within each cluster are 
similar to each other, wherein said clustering is based upon words contained in each 
hypertext document, out-links from each hypertext document, and in-links to each 
hypertext document (i.e., "Web documents can usually be divided into disjoint sets based on 
their document content. This problem is known as Web document Classification or Web 
Document Categorization. Besides the content, web document also contains a set of hyperlinks 
that points to other web documents. These sets of hyperlinks can provide information about 
inter-relationship among web documents. In this paper, we will propose an algorithm to partition 
a set of web documents based on their networked hyperlink structure. We will also present a 
similarity definition between documents that is based on the document content to measure the 



Application/Control Number: 10/660,242 Page 4 

Art Unit: 2165 

similarities between documents within a partition. The definition of similarity can be used to 
prune some irrelevant documents away in order to maintain the consistency of the document 
subset. . . . Importance defines how important the document is among the set of documents by 
considering a document's in-link and out-link. If importance is large, it means the document is 
an important document, which is of high inter-relation with other documents. ... we will focus 
on the similarity between documents within a subset of documents and retrieve the most 
important document (the representative) in a subset. Due to the unstructured property of web 
documents, ..." The preceding text excerpts clearly indicate that a set of web documents 
retrieved from web databases is clustered into various clusters such that documents within each
cluster are similar to each other (i.e., discriminated and disambiguated from others). The
similarity is decided by their word content and their hyperlinks, i.e., in-links and out-links. A
document subset/dictionary is created based on shared similar words or hyperlink status. The
subsets/dictionaries are pruned when the similarity threshold (i.e., word or hyperlink network, i.e.,
in-link or out-link) is crossed to maintain the consistency of the documents in the cluster. The
relative importance of documents is determined based on the content, i.e., words, or hyperlinks,
i.e., in-links and out-links, among other documents in the subset.) (page 44; page 45; page 46;
page 48).
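
For illustration only, and not drawn from the text of Kuo: the following is a minimal Python sketch of how a similarity measure over shared words, out-links, and in-links of the kind recited in the claims might be computed. The function names, the use of Jaccard overlap, and the equal default weights are all assumptions made for this sketch.

```python
def jaccard(a, b):
    """Overlap of two collections: |a & b| / |a | b| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(doc_a, doc_b, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted overlap of words, out-links, and in-links (hypothetical measure).

    Each document is a dict with the keys "words", "out_links", "in_links".
    """
    features = ("words", "out_links", "in_links")
    return sum(w * jaccard(doc_a[f], doc_b[f]) for w, f in zip(weights, features))


# Hypothetical documents, used only to exercise the sketch.
doc_1 = {"words": {"web", "cluster"}, "out_links": {"u1"}, "in_links": {"u9"}}
doc_2 = {"words": {"web", "search"}, "out_links": {"u1"}, "in_links": {"u8"}}
score = similarity(doc_1, doc_2)
```

Under these assumptions, two documents would be placed in the same cluster when the score exceeds a chosen threshold.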

As to claims 2, 16 and 43, Kuo teaches wherein said set of hypertext documents
comprises a collection of unstructured, unlabeled documents and said clustering 
organizes said set of hypertext documents into labeled categories that are discriminated 
and disambiguated from each other (see explanations above) (page 44; page 45; page 46;
page 48). 

As to claims 4, 17 and 44, Kuo teaches wherein said hypertext documents are
considered similar if said hypertext documents share one or more of said words, said 
out-links, and said in-links (see explanations above) (page 44; page 45; page 46; page 48). 

As to claims 5, 18 and 45, Kuo teaches wherein said clustering includes
determining a relative importance of said words, said out-links, and said in-links in an 
adaptive, data-driven process (see explanations above) (page 44; page 45; page 46; page 48). 

As to claims 38, 39 and 40, Kuo teaches pruning function words from said word
dictionary (see explanations above) (page 44; page 45; page 46; page 48). 

Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

6. Claims 6-14, 19-27 and 46-54 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Kuo as discussed in the rejections above in view of Pirolli et al., Silk
from Sow's Ear: Extracting Usable Structures from the Web, CHI 1996, pages 1-9 and
hereinafter referred to as Pirolli.

As to claims 6-14, 19-27 and 46-54, Kuo does not explicitly teach annotating 
each cluster using information nuggets. 

Pirolli teaches annotating each cluster using information nuggets (i.e., "We 
developed methods for annotating pages with their functional types and relevancy/importance 
assessments as well as aggregating the Web into collections which can be treated as collections. 
... In particular, we have designed methods for classifying nodes into a number of functional 
categories, spreading relevance based on selecting one or more source nodes and dimensions of 
interest . . . The degree of relevance of Web pages to one another can be conceived as similarities 
among Web pages located in abstract space. . . . One type of graph structure represents the link 
topology of a Web locality by using arcs labeled with unit strengths to connect one graph node to 
another when there exists a hypertext link between the corresponding Web pages. ... A second 
type of graph structure represents the inter-page text content similarity by labeling arcs 
connecting nodes with the computed text similarities between corresponding Web pages. This is 
common way of conceptualizing documents in search-based information retrieval. . . . This 
analysis produces an adjacency matrix for the particular locality. . . . From this, a vector that 
contains each node's frequency of requests and a matrix containing the number of traversals 
from one page to another are computed using software that identifies . . . Another source of 
information about relationship between pages is the similarity of their textual content. 
Techniques from information retrieval [9] can be straightforwardly applied to calculate a 
similarity matrix which provides a usable measure of this variable. . . . The solution path we took
to determine the set of pages that comprise each class folds in the above usage, textual similarity, 
and meta-information for each item in the Xerox Web space. Specifically, a new matrix was 
created with each row representing an item and the columns representing the item's: . size, in 
bytes, of the item . inlinks, the number of hyperlinks that point to the item from the Xerox Web 
space . outlinks, the number of hyperlinks the item contains that point to other items in the 
Xerox Web space. . . . Ci = W1V1 + W2V2 + . . . + WnVn (1) for all nodes i in Xerox Web space,
where Vj are the measured features of each Web page, and the Wj are weights." The preceding 
text excerpts clearly indicate that collections/clusters/classes where web pages are similar are
annotated based on their similarity in hyperlinks, i.e., in-links and out-links/citations and references,
and textual contents, i.e., words. Similarity vectors/matrices (i.e., a vector is a column matrix) are
created using features, e.g., hyperlinks, i.e., in-links, out-links, citations, references, and their
frequency, i.e., how typical the feature is, and weights.) (page 1, column 1; page 2, column 1,
column 2; page 3, column 1, column 2; page 5, column 1, column 2).
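
For illustration only: the weighted sum quoted from Pirolli, Ci = W1V1 + W2V2 + . . . + WnVn, can be sketched in Python as follows. The function name and the sample feature values and weights are hypothetical and do not come from Pirolli.

```python
def class_score(features, weights):
    """Return the sum of W_j * V_j over the measured features V_j of one page."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * v for w, v in zip(weights, features))


# Hypothetical page with two features (e.g., in-link count, out-link count).
example = class_score([2, 3], [0.5, 1.0])
```

A page would then be assigned to a class according to how its score compares with a cutoff chosen for that class; the cutoff is likewise an assumption of this sketch.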

It would have been obvious to a person of ordinary skill in the art at the time of 
Applicant's invention to modify the teachings of Kuo with the teachings of Pirolli to 
include annotating each cluster using information nuggets with the motivation to 
harness both the topology and textual similarity between items as well as integrate new 
analyses based upon a WWW space (Pirolli, page 1, column 1). 

7. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Apu M. Mofiz, whose telephone number is (571) 272-4080.
The examiner can normally be reached on Monday - Thursday 8:00 A.M. to 4:30 P.M.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Jeffrey Gaffin, can be reached at (571) 272-4146. The fax number for the
group is (571) 273-8300.

Any inquiry of a general nature or relating to the status of this application should 
be directed to the Group receptionist whose telephone number is (703) 305-9600. 




Apu M. Mofiz
Primary Patent Examiner 
Technology Center 2100 